Ontologising Relational Triples into a Portuguese Thesaurus
Authors
Abstract
With the automatic acquisition and integration of knowledge from heterogeneous resources in mind, this paper proposes several automatic methods for attaching term-based relational triples to the synsets of a thesaurus without exploiting the extraction context for disambiguation. The proposed methods are used to attach triples extracted from a Portuguese dictionary to the synsets of a Portuguese thesaurus, and two perspectives on their performance are given: the resulting synset-based triples were automatically validated based on support found in a corpus and were then evaluated against a handcrafted gold resource.
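To make the task concrete, the following is a minimal sketch of the candidate-generation step only, not the paper's attachment methods, which the abstract does not detail. It assumes a thesaurus represented as a mapping from synset identifiers to term sets; the identifiers, the function names `synsets_with` and `candidate_sb_triples`, and the data layout are all hypothetical, and the terms are borrowed from the plant/forest example listed further down this page.

```python
from itertools import product

# Illustrative thesaurus (assumption): synset id -> set of terms.
THESAURUS = {
    "A1": {"relação", "quadro", "planta", "mapa"},
    "A2": {"vegetal", "planta"},
    "A3": {"traçado", "desenho", "projeto", "planta", "plano"},
    "B1": {"bosque", "floresta", "mata", "brenha", "selva"},
}


def synsets_with(term, thesaurus):
    """Return the ids of all synsets that contain the given term, sorted."""
    return sorted(sid for sid, terms in thesaurus.items() if term in terms)


def candidate_sb_triples(tb_triple, thesaurus):
    """Enumerate every synset-based triple a term-based triple could map to.

    Choosing among these candidates is the disambiguation problem the paper
    addresses; this sketch deliberately stops before that step.
    """
    arg1, relation, arg2 = tb_triple
    pairs = product(synsets_with(arg1, thesaurus), synsets_with(arg2, thesaurus))
    return [(s1, relation, s2) for s1, s2 in pairs]


candidates = candidate_sb_triples(("planta", "part-of", "floresta"), THESAURUS)
print(candidates)
```

Because "planta" occurs in three synsets and "floresta" in one, three candidates are produced; an attachment method then has to pick the plausible one without access to the sentence the triple was extracted from.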
Similar resources
Ontologising Semantic Relations into a Relationless Thesaurus
tb-triple = {planta part-of floresta} (plant part-of forest)
A1: relação, quadro, planta, mapa
B1: bosque, floresta, mata, brenha, selva
A2: vegetal, planta
A3: traçado, desenho, projeto, planta, plano
plausible sb-triples = {A2, B1}

tb-triple = {passageiro purpose-of carruagem} (passenger purpose-of carriage)
A1: passageiro, viajante
B1: carriagem, carruagem, carraria
A2: passageiro, viador
B2...
Distributional Thesaurus vs. WordNet: A Comparison of Backoff Techniques for Unsupervised PP Attachment
Prepositional Phrase (PP) attachment can be addressed by considering frequency counts of dependency triples seen in a non-annotated corpus. However, not all triples appear even in very big corpora. To solve this problem, several techniques have been used. We evaluate two different backoff methods, one based on WordNet and the other on a distributional (automatically created) thesaurus. We work ...
Tethering Cultural Data with RDF
This paper describes a research project leading to the development of a data querying application based on the Jena RDF store. The starting point is a collection of relational databases holding cultural heritage material from the National Collections of Scotland. This source data is a mixture of fixed fields and free text, supported by background material such as domain thesauri. The aim is to ...
Update-Enabled Triplification of Relational Data into Virtual RDF Stores
The current buzzword in the Internet community is the Semantic Web initiative proposed by the W3C to yield a Web that is more flexible and self-adapting. However, for the Semantic Web initiative to become a reality, heterogeneous data sources need to be integrated in order to enable access to them in a homogeneous manner. Since a vast majority of data currently resides in relational databases, ...
AUSEABED Article
An overwhelming amount of data on seabed materials is in the form of word-based descriptions. With the rapid growth of mapping systems, relational databases and numerical modelling tools, a strong need has arisen for ways to transform that data into numerics which are much better suited to computerised display and analysis systems. A new technique for doing this draws on the concepts and process...